Install and Load Required Packages

install.packages("dplyr")
Error in install.packages : Updating loaded packages
install.packages("plotly")
Error in install.packages : Updating loaded packages
install.packages("rmarkdown")
Error in install.packages : Updating loaded packages
library(dplyr)
library(plotly)
library(rmarkdown)

Importing the Data

For the Exam, we will use the Import Dataset menu on the right instead of uploading from within the notebook.

  1. Download the appropriate data from Canvas/GitHub
  2. Click Import Dataset
  3. Select the first option: From Base (Text)
  4. Go to the Downloads folder and select your data
  5. Note the name of the dataset and click Import

Simple Linear Regression Analysis

Let’s continue to work with Exercise 1.21

y<-CH01PR21$V1      #This code renames the variables. Note that variables are labeled "V1" not "X1"
x<-CH01PR21$V2

The following code executes a simple linear regression analysis from model fitting to anova output to interactive scatterplot with line of best fit. See comments below.

Fit the model:

fit1.21<-lm(y~x)

Scatterplot:

# Create the scatterplot with a line of best fit
scatter_plot <- plot_ly(data=CH01PR21, x = ~x, y = ~y, mode = "markers", type = "scatter", name = "Data") %>%  

    add_trace(mode = "lines", type = "scatter", line = list(color = "red"), y = lm(y~x,data=CH01PR21)$fitted.values, name = "Line of Best Fit") %>%
  layout(title = "Scatterplot with Line of Best Fit",
         xaxis = list(title = "X-Axis Label"),
         yaxis = list(title = "Y-Axis Label"))

# Print the plot
print(scatter_plot)
NULL

Model summary with parameter estimates, standard errors, test statistics, p-values and more Exercise 2.6b

summary(fit1.21)

Call:
lm(formula = y ~ x)

Residuals:
   Min     1Q Median     3Q    Max 
  -2.2   -1.2    0.3    0.8    1.8 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  10.2000     0.6633  15.377 3.18e-07 ***
x             4.0000     0.4690   8.528 2.75e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.483 on 8 degrees of freedom
Multiple R-squared:  0.9009,    Adjusted R-squared:  0.8885 
F-statistic: 72.73 on 1 and 8 DF,  p-value: 2.749e-05

Parameter Estimation Exercise 2.6a

confint(fit1.21,level=0.95)
               2.5 %    97.5 %
(Intercept) 8.670370 11.729630
x           2.918388  5.081612

Let’s verify

print("Lower limit")
[1] "Lower limit"
fit1.21$coefficients[2]-qt(.025,8,lower.tail=F)*((sum(fit1.21$residuals^2)/8)/(sum((x-mean(x))^2)))^.5
       x 
2.918388 
print("Upper limit")
[1] "Upper limit"
fit1.21$coefficients[2]+qt(.025,8,lower.tail=F)*((sum(fit1.21$residuals^2)/8)/(sum((x-mean(x))^2)))^.5
       x 
5.081612 
fit1.21$residuals
   1    2    3    4    5    6    7    8    9   10 
 1.8 -1.2 -1.2  1.8 -0.2 -1.2 -2.2  0.8  0.8  0.8 

Confidence intervals for mean response and prediction intervals for future independent responses. Exercise 2.15ab

#Create a data frame with new explanatory values
newdata<-data.frame(x=c(2,4))
print("Fitted value, lower and upper bounds for confidence interval")
[1] "Fitted value, lower and upper bounds for confidence interval"
predict(fit1.21,newdata,interval="confidence",level = .99)
   fit      lwr      upr
1 18.2 15.97429 20.42571
2 26.2 21.22316 31.17684
#The two rows of out correspond to x=2 and x=4 respectively

print("Fitted value, lower and upper bounds for prediction interval")
[1] "Fitted value, lower and upper bounds for prediction interval"
predict(fit1.21,newdata,interval="predict",level = .99)
   fit      lwr      upr
1 18.2 12.74814 23.65186
2 26.2 19.16168 33.23832
summary(fit1.21)

Call:
lm(formula = y ~ x)

Residuals:
   Min     1Q Median     3Q    Max 
  -2.2   -1.2    0.3    0.8    1.8 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  10.2000     0.6633  15.377 3.18e-07 ***
x             4.0000     0.4690   8.528 2.75e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.483 on 8 degrees of freedom
Multiple R-squared:  0.9009,    Adjusted R-squared:  0.8885 
F-statistic: 72.73 on 1 and 8 DF,  p-value: 2.749e-05

Confidence band

pf121 = predict(fit1.21,newdata)
pf121
   1    2 
18.2 26.2 
n<-dim(CH01PR21)[1]
alpha<-0.01
xh<-2
W<-(2*qf(1-alpha,2,n-2))^0.5
MSE<-sum(fit1.21$residuals^2)/(n-2)
kxx<-sum((x-mean(x))^2)

print("Lower limit")
[1] "Lower limit"
pf121[1]-W*(MSE*(1/n+(xh-mean(x))^2/kxx))^.5
       1 
15.44116 
print("Upper limit")
[1] "Upper limit"
pf121[1]+W*(MSE*(1/n+(xh-mean(x))^2/kxx))^.5
       1 
20.95884 
n<-dim(CH01PR21)[1]
alpha<-0.01
xh<-4
W<-(2*qf(1-alpha,2,n-2))^0.5
MSE<-sum(fit1.21$residuals^2)/(n-2)
kxx<-sum((x-mean(x))^2)

print("Lower limit")
[1] "Lower limit"
pf121[2]-W*(MSE*(1/n+(xh-mean(x))^2/kxx))^.5
       2 
20.03104 
print("Upper limit")
[1] "Upper limit"
pf121[2]+W*(MSE*(1/n+(xh-mean(x))^2/kxx))^.5
       2 
32.36896 
(MSE*(1/n+(xh-mean(x))^2/kxx))
[1] 2.2

Anova table

anova(fit1.21)
Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq F value    Pr(>F)    
x          1  160.0   160.0  72.727 2.749e-05 ***
Residuals  8   17.6     2.2                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Multiple Linear Regression

As done above, first import the data using the menu on the right. Then assign variables:

y<-CH06PR18$V1      #This code renames the variables
x1<-CH06PR18$V2
x2<-CH06PR18$V3
x3<-CH06PR18$V4
x4<-CH06PR18$V5

Fit the Multiple Linear Regression (MLR) model:

fit6.18<-lm(y~x1+x2+x3+x4)

The Big Picture Model Summary Model summary with parameter estimates, standard errors, test statistics, p-values and more

summary(fit6.18)

Call:
lm(formula = y ~ x1 + x2 + x3 + x4)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.1872 -0.5911 -0.0910  0.5579  2.9441 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.220e+01  5.780e-01  21.110  < 2e-16 ***
x1          -1.420e-01  2.134e-02  -6.655 3.89e-09 ***
x2           2.820e-01  6.317e-02   4.464 2.75e-05 ***
x3           6.193e-01  1.087e+00   0.570     0.57    
x4           7.924e-06  1.385e-06   5.722 1.98e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.137 on 76 degrees of freedom
Multiple R-squared:  0.5847,    Adjusted R-squared:  0.5629 
F-statistic: 26.76 on 4 and 76 DF,  p-value: 7.272e-14
LS0tCnRpdGxlOiAiU1RBVCA1MTAgTUlEVEVSTSAxIgpvdXRwdXQ6IGh0bWxfbm90ZWJvb2sKLS0tCgojIyBJbnN0YWxsIGFuZCBMb2FkIFJlcXVpcmVkIFBhY2thZ2VzCgpgYGB7cn0KaW5zdGFsbC5wYWNrYWdlcygiZHBseXIiKQppbnN0YWxsLnBhY2thZ2VzKCJwbG90bHkiKQppbnN0YWxsLnBhY2thZ2VzKCJybWFya2Rvd24iKQpsaWJyYXJ5KGRwbHlyKQpsaWJyYXJ5KHBsb3RseSkKbGlicmFyeShybWFya2Rvd24pCmBgYAoKIyMgKipJbXBvcnRpbmcgdGhlIERhdGEqKgoKRm9yIHRoZSBFeGFtLCB3ZSB3aWxsIHVzZSB0aGUgKipJbXBvcnQgRGF0YXNldCoqIG1lbnUgb24gdGhlIHJpZ2h0IGluc3RlYWQgb2YgdXBsb2FkaW5nIGZyb20gd2l0aGluIHRoZSBub3RlYm9vay4KCjEuICBEb3dubG9hZCB0aGUgYXBwcm9wcmlhdGUgZGF0YSBmcm9tIENhbnZhcy9HaXRIdWIKMi4gIENsaWNrICoqSW1wb3J0IERhdGFzZXQqKgozLiAgU2VsZWN0IHRoZSBmaXJzdCBvcHRpb246ICoqRnJvbSBCYXNlIChUZXh0KSoqCjQuICBHbyB0byB0aGUgRG93bmxvYWRzIGZvbGRlciBhbmQgc2VsZWN0IHlvdXIgZGF0YQo1LiAgTm90ZSB0aGUgbmFtZSBvZiB0aGUgZGF0YXNldCBhbmQgY2xpY2sgKipJbXBvcnQqKgoKIyAqKlNpbXBsZSBMaW5lYXIgUmVncmVzc2lvbiBBbmFseXNpcyoqCgpMZXQncyBjb250aW51ZSB0byB3b3JrIHdpdGggRXhlcmNpc2UgMS4yMQoKYGBge3J9Cnk8LUNIMDFQUjIxJFYxICAgICAgI1RoaXMgY29kZSByZW5hbWVzIHRoZSB2YXJpYWJsZXMuIE5vdGUgdGhhdCB2YXJpYWJsZXMgYXJlIGxhYmVsZWQgIlYxIiBub3QgIlgxIgp4PC1DSDAxUFIyMSRWMgpgYGAKClRoZSBmb2xsb3dpbmcgY29kZSBleGVjdXRlcyBhICoqc2ltcGxlIGxpbmVhciByZWdyZXNzaW9uKiogYW5hbHlzaXMgZnJvbSBtb2RlbCBmaXR0aW5nIHRvIGFub3ZhIG91dHB1dCB0byBpbnRlcmFjdGl2ZSBzY2F0dGVycGxvdCB3aXRoIGxpbmUgb2YgYmVzdCBmaXQuIFNlZSBjb21tZW50cyBiZWxvdy4KCkZpdCB0aGUgbW9kZWw6CgpgYGB7cn0KZml0MS4yMTwtbG0oeX54KQpgYGAKClNjYXR0ZXJwbG90OgoKYGBge1J9CiMgQ3JlYXRlIHRoZSBzY2F0dGVycGxvdCB3aXRoIGEgbGluZSBvZiBiZXN0IGZpdApzY2F0dGVyX3Bsb3QgPC0gcGxvdF9seShkYXRhPUNIMDFQUjIxLCB4ID0gfngsIHkgPSB+eSwgbW9kZSA9ICJtYXJrZXJzIiwgdHlwZSA9ICJzY2F0dGVyIiwgbmFtZSA9ICJEYXRhIikgJT4lICAKCiAgICBhZGRfdHJhY2UobW9kZSA9ICJsaW5lcyIsIHR5cGUgPSAic2NhdHRlciIsIGxpbmUgPSBsaXN0KGNvbG9yID0gInJlZCIpLCB5ID0gbG0oeX54LGRhdGE9Q0gwMVBSMjEpJGZpdHRlZC52YWx1ZXMsIG5hbWUgPSAiTGluZSBvZiBCZXN0IEZpdCIpICU+JQogIGxheW91dCh0aXRsZSA9ICJTY2F0dGVycGxvdCB3aXRoIExpbmUgb2YgQmVzdCBGaXQiLAogICAgICAgICB4YXhpcyA9IGxpc3QodGl0bGUgPSAiWC1BeGlzIExhYmVsIiksCiAgICAgICAgIHlheGlzID0gbGlzdCh0aXRsZSA9ICJZLUF4aXMgTGFiZWwiKSkKCiMgUHJpbnQgdGhlIHBsb3QKcHJpbnQoc2NhdHRlcl9wbG90KQoKYGBgCgoqKk1vZGVsIHN1bW1hcnkqKiB3aXRoIHBhcmFtZXRlciBlc3RpbWF0ZXMsIHN0YW5kYXJkIGVycm9ycywgdGVzdCBzdGF0aXN0aWNzLCBwLXZhbHVlcyBhbmQgbW9yZSBFeGVyY2lzZSAyLjZiCgpgYGB7cn0Kc3VtbWFyeShmaXQxLjIxKQpgYGAKCioqUGFyYW1ldGVyIEVzdGltYXRpb24gRXhlcmNpc2UgMi42YSoqCgpgYGB7cn0KY29uZmludChmaXQxLjIxLGxldmVsPTAuOTUpCmBgYAoKTGV0J3MgdmVyaWZ5CgpgYGB7cn0KcHJpbnQoIkxvd2VyIGxpbWl0IikKZml0MS4yMSRjb2VmZmljaWVudHNbMl0tcXQoLjAyNSw4LGxvd2VyLnRhaWw9RikqKChzdW0oZml0MS4yMSRyZXNpZHVhbHNeMikvOCkvKHN1bSgoeC1tZWFuKHgpKV4yKSkpXi41CnByaW50KCJVcHBlciBsaW1pdCIpCmZpdDEuMjEkY29lZmZpY2llbnRzWzJdK3F0KC4wMjUsOCxsb3dlci50YWlsPUYpKigoc3VtKGZpdDEuMjEkcmVzaWR1YWxzXjIpLzgpLyhzdW0oKHgtbWVhbih4KSleMikpKV4uNQoKZml0MS4yMSRyZXNpZHVhbHMKYGBgCgoqKkNvbmZpZGVuY2UgaW50ZXJ2YWxzKiogZm9yIG1lYW4gcmVzcG9uc2UgYW5kICoqcHJlZGljdGlvbiBpbnRlcnZhbHMqKiBmb3IgZnV0dXJlIGluZGVwZW5kZW50IHJlc3BvbnNlcy4gRXhlcmNpc2UgMi4xNWFiCgpgYGB7cn0KI0NyZWF0ZSBhIGRhdGEgZnJhbWUgd2l0aCBuZXcgZXhwbGFuYXRvcnkgdmFsdWVzCm5ld2RhdGE8LWRhdGEuZnJhbWUoeD1jKDIsNCkpCnByaW50KCJGaXR0ZWQgdmFsdWUsIGxvd2VyIGFuZCB1cHBlciBib3VuZHMgZm9yIGNvbmZpZGVuY2UgaW50ZXJ2YWwiKQpwcmVkaWN0KGZpdDEuMjEsbmV3ZGF0YSxpbnRlcnZhbD0iY29uZmlkZW5jZSIsbGV2ZWwgPSAuOTkpCiNUaGUgdHdvIHJvd3Mgb2Ygb3V0IGNvcnJlc3BvbmQgdG8geD0yIGFuZCB4PTQgcmVzcGVjdGl2ZWx5CgpwcmludCgiRml0dGVkIHZhbHVlLCBsb3dlciBhbmQgdXBwZXIgYm91bmRzIGZvciBwcmVkaWN0aW9uIGludGVydmFsIikKcHJlZGljdChmaXQxLjIxLG5ld2RhdGEsaW50ZXJ2YWw9InByZWRpY3QiLGxldmVsID0gLjk5KQoKc3VtbWFyeShmaXQxLjIxKQpgYGAKCioqQ29uZmlkZW5jZSBiYW5kKioKCmBgYHtyfQpwZjEyMSA9IHByZWRpY3QoZml0MS4yMSxuZXdkYXRhKQpwZjEyMQpgYGAKCmBgYHtyfQpuPC1kaW0oQ0gwMVBSMjEpWzFdCmFscGhhPC0wLjAxCnhoPC0yClc8LSgyKnFmKDEtYWxwaGEsMixuLTIpKV4wLjUKTVNFPC1zdW0oZml0MS4yMSRyZXNpZHVhbHNeMikvKG4tMikKa3h4PC1zdW0oKHgtbWVhbih4KSleMikKCnByaW50KCJMb3dlciBsaW1pdCIpCnBmMTIxWzFdLVcqKE1TRSooMS9uKyh4aC1tZWFuKHgpKV4yL2t4eCkpXi41CgpwcmludCgiVXBwZXIgbGltaXQiKQpwZjEyMVsxXStXKihNU0UqKDEvbisoeGgtbWVhbih4KSleMi9reHgpKV4uNQpgYGAKCmBgYHtyfQpuPC1kaW0oQ0gwMVBSMjEpWzFdCmFscGhhPC0wLjAxCnhoPC00Clc8LSgyKnFmKDEtYWxwaGEsMixuLTIpKV4wLjUKTVNFPC1zdW0oZml0MS4yMSRyZXNpZHVhbHNeMikvKG4tMikKa3h4PC1zdW0oKHgtbWVhbih4KSleMikKCnByaW50KCJMb3dlciBsaW1pdCIpCnBmMTIxWzJdLVcqKE1TRSooMS9uKyh4aC1tZWFuKHgpKV4yL2t4eCkpXi41CgpwcmludCgiVXBwZXIgbGltaXQiKQpwZjEyMVsyXStXKihNU0UqKDEvbisoeGgtbWVhbih4KSleMi9reHgpKV4uNQoKKE1TRSooMS9uKyh4aC1tZWFuKHgpKV4yL2t4eCkpCgpgYGAKCioqQW5vdmEgdGFibGUqKgoKYGBge3J9CmFub3ZhKGZpdDEuMjEpCmBgYAoKIyBNdWx0aXBsZSBMaW5lYXIgUmVncmVzc2lvbgoKQXMgZG9uZSBhYm92ZSwgZmlyc3QgaW1wb3J0IHRoZSBkYXRhIHVzaW5nIHRoZSBtZW51IG9uIHRoZSByaWdodC4gVGhlbiBhc3NpZ24gdmFyaWFibGVzOgoKYGBge3J9Cnk8LUNIMDZQUjE4JFYxICAgICAgI1RoaXMgY29kZSByZW5hbWVzIHRoZSB2YXJpYWJsZXMKeDE8LUNIMDZQUjE4JFYyCngyPC1DSDA2UFIxOCRWMwp4MzwtQ0gwNlBSMTgkVjQKeDQ8LUNIMDZQUjE4JFY1CmBgYAoKRml0IHRoZSBNdWx0aXBsZSBMaW5lYXIgUmVncmVzc2lvbiAoTUxSKSBtb2RlbDoKCmBgYHtyfQpmaXQ2LjE4PC1sbSh5fngxK3gyK3gzK3g0KQpgYGAKCioqVGhlIEJpZyBQaWN0dXJlKiogKk1vZGVsIFN1bW1hcnkqIE1vZGVsIHN1bW1hcnkgd2l0aCBwYXJhbWV0ZXIgZXN0aW1hdGVzLCBzdGFuZGFyZCBlcnJvcnMsIHRlc3Qgc3RhdGlzdGljcywgcC12YWx1ZXMgYW5kIG1vcmUKCmBgYHtyfQpzdW1tYXJ5KGZpdDYuMTgpCmBgYAo=